Abstract

In this article we describe Bayesian nonparametric procedures for two-sample hypothesis testing. Namely, given two sets of samples $y^{(1)} \sim F^{(1)}$ and $y^{(2)} \sim F^{(2)}$, with $F^{(1)}, F^{(2)}$ unknown, we wish to evaluate the evidence for the null hypothesis $H_0 : F^{(1)} = F^{(2)}$ versus the alternative $H_1 : F^{(1)} \neq F^{(2)}$. Our method is based upon a nonparametric Polya tree prior centered either subjectively or using an empirical procedure. We show that the Polya tree prior leads to an analytic expression for the marginal likelihood under the two hypotheses and hence an explicit measure of the probability of the null $\Pr(H_0 \mid \{y^{(1)}, y^{(2)}\})$.



1 Introduction 

Nonparametric hypothesis testing is an important branch of statistics with wide applicability. For example, we often wish to evaluate the evidence for systematic differences between real-valued responses under two different treatments without specifying an underlying distribution for the data. That is, given two sets of samples $y^{(1)} \sim F^{(1)}$ and $y^{(2)} \sim F^{(2)}$, with $F^{(1)}, F^{(2)}$ unknown, we wish to evaluate the evidence for the competing hypotheses

$$H_0 : F^{(1)} = F^{(2)} \quad \text{versus} \quad H_1 : F^{(1)} \neq F^{(2)}.$$

In this article we describe a nonparametric Bayesian procedure for this scenario. Our Bayesian method quantifies the weight of evidence in favour of $H_0$ in terms of an explicit probability measure $\Pr(H_0 \mid y^{(1,2)})$, where $y^{(1,2)}$ denotes the combined data set $y^{(1,2)} = \{y^{(1)}, y^{(2)}\}$. To perform the test we use a Polya tree prior [Lavine, 1992] where under $H_0$ we have $F^{(1,2)} = F^{(1)} = F^{(2)}$ centered on some distribution $G^{(1,2)}$ and under $H_1$, $F^{(1)} \neq F^{(2)}$ centered on distributions $G^{(1)}$ and $G^{(2)}$ respectively. The Polya tree is a well known nonparametric prior distribution for random probability measures $F$ on $\Omega$, where $\Omega$ denotes the domain of $Y$ [Ferguson, 1974]. One advantage of the Polya tree is that it exhibits conjugacy, which enables us to obtain analytic expressions for the marginal likelihood of $H_0$ and $H_1$ given the data. A major motivation of our work was to develop a Bayesian test which is simple to implement with default user-set parameters and that can be easily understood by non-statisticians. This issue is discussed in detail in sections 2 and 4.

Bayesian nonparametrics is a fast developing discipline. Walker and Mallick [1999] provide a good overview of the field including a nice description of the Polya tree prior. While there has been considerable interest in nonparametric inference there has, somewhat surprisingly, been little written on nonparametric hypothesis testing, and most work has concentrated on testing a parametric model versus a nonparametric alternative (the Goodness of Fit problem). Initial work on the Goodness of Fit problem was undertaken by Florens et al. [1996] and Carota and Parmigiani [1996], who use a Dirichlet process prior for the alternative distribution and compare to a parametric model. In this case, the nonparametric distributions will be discrete and the Bayes factor will include a penalty term for ties. The method can lead to misleading results if the data is absolutely continuous. This has led to the development of methods using classes of nonparametric prior that guarantee continuous distributions.

Figure 1: Illustration of the construction of a Polya tree distribution. Each of the $\theta_{\epsilon_m}$ is independently drawn from $Beta(\alpha_{\epsilon_m 0}, \alpha_{\epsilon_m 1})$.



Dirichlet process mixture models are one class. The calculation of Bayes factors for Dirichlet process-based models is discussed by Basu and Chib [2003]. Goodness of fit testing using mixtures of triangular distributions is given by McVinish et al. [2009]. An alternative form of prior, the Polya tree, was considered by Berger and Guglielmi [2001]. Simple conditions on the prior lead to absolutely continuous distributions. Berger and Guglielmi [2001] develop a default approach and consider its properties as a conditional frequentist method. Hanson discusses the use of Savage-Dickey density ratios to calculate Bayes factors in favour of the centering distribution (see also Branscum and Hanson [2008]). Consistency issues are discussed in general by Dass and Lee [2004] and McVinish et al. [2009]. There has been less work on testing the hypothesis that two distributions are the same. Pennell and Dunson [2008] develop a Mixture of Dependent Dirichlet Processes approach to testing changes in an ordered sequence of distributions. However, rather than using Bayes factors, a tolerance measure approach is developed. Of course, Bayesian parametric hypothesis testing where $F^{(1)}$ and $F^{(2)}$ are of known form is well developed in the Bayesian literature, see e.g. Bernardo and Smith.

In the non-Bayesian literature nonparametric hypothesis testing is a mature discipline. Well known procedures include the Wilcoxon Signed-Rank test and the Kolmogorov-Smirnov test, see e.g. Lehmann and Romano [2008]. Chapter 5 in Andersen et al. [1993] provides details of associated methods in survival analysis. However, none of these non-Bayesian procedures provide an explicit probabilistic measure of $\Pr(H_0 \mid y^{(1,2)})$, which is our interest here. We would argue that phrasing the test in a probabilistic fashion is a natural approach from which to report the evidence for $H_0$.

The rest of the paper is as follows. In section 2 we discuss the Polya tree prior and derive the marginal probability distributions that result. In section 3 we describe our method and algorithm for calculating $\Pr(H_0 \mid y^{(1,2)})$ based on subjective priors. In section 4 we discuss an empirical Bayes procedure where the Polya Tree priors are centered on the empirical cdf of the joint data. Section 5 concludes with a brief discussion.



2 Polya tree priors 



Polya trees form a class of distributions for random probability measures $F$ on some domain $\Omega$ [Lavine, 1992; Mauldin et al., 1992; Lavine, 1994]. The Polya tree has a simple constructive formulation that we now describe.

Consider a dyadic (binary) tree that recursively partitions $\Omega$ into disjoint measurable sets such that at the $m$th level of the tree we find $\Omega = \bigcup_{j=0}^{2^m - 1} B_j^{(m)}$ where $B_i^{(m)} \cap B_j^{(m)} = \emptyset$ for all $i \neq j$. Figure 1, adapted from Ferguson [1974], illustrates such a tree up to level 2 where $\Omega = [0, 1)$. The $j$th junction in the tree at level $m$ has associated set $B_j^{(m)}$ and clearly $B_j^{(m)} = (B_{2j}^{(m+1)}, B_{2j+1}^{(m+1)})$ for all $j$. Conceptually we should imagine such a tree descending ad infinitum. It will be convenient in what follows to simply index the sets using a base 2 subscript and drop the superscript so that, for example, $B_{000}$ indicates the first set in level 3, $B_{0011}$ the fourth set in level 4 and so on.






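To define a random measure on $\Omega$ we construct random measures on the sets $B_j$. It is instructive to imagine a particle cascading down through the tree such that at the $j$th junction the probability of turning left or right is $\theta_j$ and $(1 - \theta_j)$ respectively. In addition we consider $\theta_j$ to be a random variable with some appropriate distribution $\theta_j \sim \pi_j$. The sample path of the particle down to level $m$ will be recorded in a vector $\epsilon_m = \{\epsilon_{m1}, \epsilon_{m2}, \ldots, \epsilon_{mm}\}$ with elements $\epsilon_{mi} \in \{0, 1\}$, such that $\epsilon_{mi} = 0$ if the particle went left at level $i$ and $\epsilon_{mi} = 1$ if it went right. In this way $B_{\epsilon_m}$ denotes which partition the particle belongs to at the $m$th level. Given a set of $\theta_j$'s it is clear that the probability of the particle falling into the set $B_{\epsilon_m}$ is just

$$P(B_{\epsilon_m}) = \prod_{i=1}^{m} \theta_{\epsilon_{m1} \cdots \epsilon_{m,i-1}}^{\,1 - \epsilon_{mi}} \left(1 - \theta_{\epsilon_{m1} \cdots \epsilon_{m,i-1}}\right)^{\epsilon_{mi}}, \qquad (1)$$

where the subscript $\epsilon_{m1} \cdots \epsilon_{m,i-1}$ labels the junction reached after the first $i - 1$ turns (the top junction for $i = 1$). This is just the product of the probabilities of falling left or right at each junction that the particle passes through. In this way we have defined a random measure on the partitioning sets.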

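The Polya tree is obtained under the following conditions: that the tree descends ad infinitum, level $m \to \infty$, and that the $\theta_j$ are random variables with Beta distributions, $\theta_j \sim Be(\alpha_{j0}, \alpha_{j1})$. To be precise, let $\Pi$ denote the partition structure defined by the collection of sets $\Pi = (B_0, B_1, B_{00}, \ldots)$ and let $\mathcal{A}$ denote the collection of parameters that determine the Beta distribution at each junction, $\mathcal{A} = (\alpha_0, \alpha_1, \alpha_{00}, \ldots)$.

Definition [Lavine, 1992]

A random probability measure $F$ on $\Omega$ is said to have a Polya tree distribution, or a Polya tree prior, with parameters $(\Pi, \mathcal{A})$, written $F \sim PT(\Pi, \mathcal{A})$, if there exist nonnegative numbers $\mathcal{A} = (\alpha_0, \alpha_1, \alpha_{00}, \ldots)$ and random variables $\Theta = \{\theta_{\epsilon_m}\}$, one for each junction of the tree, such that the following hold:

1. all the random variables in $\Theta$ are independent;

2. for every $\epsilon_m$, $\theta_{\epsilon_m} \sim Be(\alpha_{\epsilon_m 0}, \alpha_{\epsilon_m 1})$;

3. for every $m = 1, 2, \ldots$ and every $\epsilon_1, \epsilon_2, \ldots$,

$$F(B_{\epsilon_1 \cdots \epsilon_m}) = \prod_{i=1}^{m} \theta_{\epsilon_1 \cdots \epsilon_{i-1}}^{\,1 - \epsilon_i} \left(1 - \theta_{\epsilon_1 \cdots \epsilon_{i-1}}\right)^{\epsilon_i},$$

where the empty subscript ($i = 1$) denotes the top junction of the tree.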

A random probability measure $F \sim PT(\Pi, \mathcal{A})$ is realized by sampling the $\theta_j$'s from the Beta distributions. The set $\Theta$ is infinite dimensional as the level of the tree is infinite and hence for most practicable applications the tree is truncated to a depth $m$. Lavine [1994] refers to this as a "partially specified" Polya tree. It is worth noting that we will not need to make this truncation in what follows and hence our test will be fully specified with analytic expressions for the marginal likelihood.

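By defining $\Pi$ and $\mathcal{A}$ the Polya tree can be centered on some chosen distribution $G_0$ so that $E[F] = G_0$ where $F \sim PT(\Pi, \mathcal{A})$. Perhaps the simplest way to achieve this is to place the partitions in $\Pi$ at the quantiles of $G_0$ and then set $\alpha_{\epsilon_j 0} = \alpha_{\epsilon_j 1}$ for all $j$ [Lavine, 1992]. For $y \in \mathbb{R}$ this leads to $B_0 = (-\infty, G_0^{-1}(0.5))$, $B_1 = [G_0^{-1}(0.5), \infty)$ and more generally, at level $m$,

$$B_{\epsilon_m} = \left[ G_0^{-1}\big((j^* - 1)/2^m\big),\; G_0^{-1}\big(j^*/2^m\big) \right), \qquad (2)$$

where $j^* \in \{1, \ldots, 2^m\}$ is one plus the decimal representation of the binary number $\epsilon_m$, so that $B_0$ and $B_1$ above correspond to $j^* = 1$ and $j^* = 2$ at level 1.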
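As a rough illustration of (2), the sketch below locates the set $B_{\epsilon_m}$ containing an observation when the partition sits on the quantiles of a standard normal $G_0$. The function name `path_to_level` and the choice of $G_0$ are assumptions made for this example only; working on the probability scale $u = G_0(y)$ means every junction splits its interval at the midpoint.

```python
from statistics import NormalDist


def path_to_level(y, m, cdf=NormalDist().cdf):
    """Return the left/right path (0 = left, 1 = right) of y down to level m.

    The partition is placed at the quantiles of the centering distribution,
    so on the scale u = G0(y) each junction splits [lo, hi) at its midpoint.
    """
    u = cdf(y)
    lo, hi = 0.0, 1.0
    path = []
    for _ in range(m):
        mid = 0.5 * (lo + hi)
        if u < mid:       # left child is [lo, mid)
            path.append(0)
            hi = mid
        else:             # right child is [mid, hi)
            path.append(1)
            lo = mid
    return path


# A point just below the median of N(0, 1): left at the top, then right twice.
print(path_to_level(-0.1, 3))
```

Reading the returned path as a binary number and adding one gives the index $j^*$ of the quantile interval at level $m$.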

It is usual to set the $\alpha$'s to be constant in a level, $\alpha_{\epsilon_m 0} = \alpha_{\epsilon_m 1} = c_m$ for some constant $c_m$. The setting of $c_m$ governs the underlying continuity of the resulting $F$'s. For example, setting $c_m = c m^2$, $c > 0$, implies that $F$ is absolutely continuous with probability 1, while $c_m = c/2^m$ defines a Dirichlet process, which makes $F$ discrete with probability 1 [Lavine, 1992; Ferguson, 1974]. We will follow the approach of Walker and Mallick [1999] and define $c_m = c m^2$. The choice of $c$ is left to Section 3.
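To make the construction concrete, here is a minimal sketch of drawing a "partially specified" Polya tree, that is, the random probabilities of the $2^m$ sets at a chosen depth. It assumes $c_m = c m^2$ with the top junction counted as level $m = 1$; that level convention and the function name are assumptions of this example, not something fixed by the text.

```python
import random


def sample_truncated_polya_tree(depth, c=1.0, seed=0):
    """Draw the probabilities F(B) of the 2**depth sets at level `depth`.

    Each junction draws theta ~ Beta(a, a) with a = c * m**2, where m is the
    level of the junction's children (assumed convention: top split is m = 1).
    """
    rng = random.Random(seed)
    probs = [1.0]  # level 0: the whole domain
    for m in range(1, depth + 1):
        a = c * m ** 2
        next_probs = []
        for p in probs:
            theta = rng.betavariate(a, a)  # probability of turning left
            next_probs.extend([p * theta, p * (1.0 - theta)])
        probs = next_probs
    return probs


probs = sample_truncated_polya_tree(depth=4)
print(len(probs), sum(probs))
```

Because the Beta parameters grow with the level, deeper splits concentrate around one half, which is the mechanism behind the continuity property quoted above.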



2.1 Conditioning and marginal likelihood 

An attractive feature of the Polya tree prior is the ease with which we can condition on data. Polya trees exhibit conjugacy: given a Polya tree prior $F \sim PT(\Pi, \mathcal{A})$ and a set of data $y$, the posterior distribution on $F$ is also a Polya tree, $F \mid y \sim PT(\Pi, \mathcal{A}^*)$, where $\mathcal{A}^*$ is the set of updated parameters, $\mathcal{A}^* = \{\alpha_0^*, \alpha_1^*, \alpha_{00}^*, \ldots\}$ with

$$\alpha_{\epsilon_i}^* \mid y = \alpha_{\epsilon_i} + n_{\epsilon_i}, \qquad (3)$$

where $n_{\epsilon_i}$ denotes the number of observations in $y$ that lie in the partition $B_{\epsilon_i}$. The corresponding random variables $\theta_j^*$ are therefore distributed a posteriori as

$$\theta_j^* \mid y \sim Be(\alpha_{j0} + n_{j0}, \alpha_{j1} + n_{j1}) \qquad (4)$$

where $n_{j0}$ and $n_{j1}$ are the numbers of observations falling left and right at the junction in the tree indicated by $j$.

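This conjugacy allows for a straightforward calculation of the marginal likelihood for any set of observations. A priori we see,

$$\Pr(y \mid \Theta, \Pi, \mathcal{A}) = \prod_j \theta_j^{n_{j0}} (1 - \theta_j)^{n_{j1}}, \qquad \theta_j \mid \mathcal{A} \sim Be(\alpha_{j0}, \alpha_{j1}), \qquad (5)$$

where the product in (5) is over the set of all partitions, $j \in \{0, 1, 00, \ldots\}$, though clearly for many partitions we have $n_{j0} = n_{j1} = 0$. Equation (5) has the form of a product of independent Binomial-Beta trials, hence the marginal likelihood is

$$\Pr(y \mid \Pi, \mathcal{A}) = \prod_j \left( \frac{\Gamma(\alpha_{j0} + \alpha_{j1})}{\Gamma(\alpha_{j0}) \Gamma(\alpha_{j1})} \, \frac{\Gamma(\alpha_{j0} + n_{j0}) \, \Gamma(\alpha_{j1} + n_{j1})}{\Gamma(\alpha_{j0} + n_{j0} + \alpha_{j1} + n_{j1})} \right), \qquad (6)$$

where $j \in \{0, 1, 00, \ldots\}$. This marginal probability will form the basis of our test for $H_0$ which we describe in the next section.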
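Each factor of (6) is a ratio of Beta functions, $B(\alpha_{j0} + n_{j0}, \alpha_{j1} + n_{j1}) / B(\alpha_{j0}, \alpha_{j1})$, so the log marginal likelihood is a sum that is conveniently evaluated with log-gamma functions. The sketch below assumes the junction counts have already been tabulated; the tuple layout and the example counts are hypothetical.

```python
from math import lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_marginal(junctions):
    """Log of (6). `junctions` is an iterable of (a0, a1, n0, n1) tuples."""
    total = 0.0
    for a0, a1, n0, n1 in junctions:
        total += log_beta(a0 + n0, a1 + n1) - log_beta(a0, a1)
    return total


# Hypothetical counts: three observations split (2, 1) at the top junction,
# and the two on the left then split (1, 1) at the next junction.
example = [(1.0, 1.0, 2, 1), (4.0, 4.0, 1, 1)]
print(log_marginal(example))
```

Junctions with $n_{j0} = n_{j1} = 0$ contribute exactly zero to the sum, which is why only occupied junctions need to be listed.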

3 A procedure for subjective Bayesian nonparametric hypothesis testing

We are interested in providing a weight of evidence in favour of $H_0$ given the observed data. From Bayes theorem,

$$\Pr(H_0 \mid y^{(1,2)}) \propto \Pr(y^{(1,2)} \mid H_0) \Pr(H_0). \qquad (7)$$

Recall that the null hypothesis $H_0$ assumes $y^{(1)}$ and $y^{(2)}$ are samples from some common distribution $F^{(1,2)}$ with $F^{(1,2)}$ unknown, and we specify our uncertainty in $F^{(1,2)}$ via a Polya tree prior, $F^{(1,2)} \sim PT(\Pi, \mathcal{A})$.

Under $H_1$, we assume $y^{(1)} \sim F^{(1)}$, $y^{(2)} \sim F^{(2)}$ with $F^{(1)}, F^{(2)}$ unknown. Again we adopt a Polya tree prior for $F^{(1)}$ and $F^{(2)}$ with the same prior parameterization as for $F^{(1,2)}$ so that

$$F^{(1)}, F^{(2)}, F^{(1,2)} \sim PT(\Pi, \mathcal{A}) \qquad (8)$$

where $\Pi$ is centered on the quantiles of some a priori centering distribution (see below). Following the approach of Walker and Mallick [1999] and Mallick and Walker [2003] we take common values for the $\alpha_j$'s at each level, $\alpha_{j0} = \alpha_{j1} = m^2$ for a parameter at level $m$. The posterior odds of the two hypotheses is

$$\frac{\Pr(H_0 \mid y^{(1,2)})}{\Pr(H_1 \mid y^{(1)}, y^{(2)})} = \frac{\Pr(y^{(1,2)} \mid H_0)}{\Pr(y^{(1)}, y^{(2)} \mid H_1)} \, \frac{\Pr(H_0)}{\Pr(H_1)}, \qquad (9)$$

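where the first term is just the ratio of marginal likelihoods, the Bayes Factor, which from (6) and conditional on our specification of $\Pi$ and $\mathcal{A}$ is

$$\frac{\Pr(y^{(1,2)} \mid H_0)}{\Pr(y^{(1)}, y^{(2)} \mid H_1)} = \prod_j \left( \frac{\Gamma(\alpha_{j0}) \Gamma(\alpha_{j1})}{\Gamma(\alpha_{j0} + \alpha_{j1})} \, \frac{\Gamma(\alpha_{j0} + n_{j0}^{(1)} + n_{j0}^{(2)}) \, \Gamma(\alpha_{j1} + n_{j1}^{(1)} + n_{j1}^{(2)})}{\Gamma(\alpha_{j0} + n_{j0}^{(1)} + n_{j0}^{(2)} + \alpha_{j1} + n_{j1}^{(1)} + n_{j1}^{(2)})} \, \frac{\Gamma(\alpha_{j0} + n_{j0}^{(1)} + \alpha_{j1} + n_{j1}^{(1)})}{\Gamma(\alpha_{j0} + n_{j0}^{(1)}) \, \Gamma(\alpha_{j1} + n_{j1}^{(1)})} \, \frac{\Gamma(\alpha_{j0} + n_{j0}^{(2)} + \alpha_{j1} + n_{j1}^{(2)})}{\Gamma(\alpha_{j0} + n_{j0}^{(2)}) \, \Gamma(\alpha_{j1} + n_{j1}^{(2)})} \right) \qquad (10)$$

where the product is over all partitions, $j \in \{0, 1, 00, \ldots\}$, $n_{j0}^{(1)}$ and $n_{j1}^{(1)}$ represent the numbers of observations in $y^{(1)}$ falling left and right at each junction, and $n_{j0}^{(2)}$ and $n_{j1}^{(2)}$ are the equivalent quantities for $y^{(2)}$.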

The product in (10) is defined over the infinite set of partitions. However, all terms cancel for which three of $\{n_{j0}^{(1)}, n_{j1}^{(1)}, n_{j0}^{(2)}, n_{j1}^{(2)}\}$ are zero. That is, to calculate (10) for the infinite partition structure we just have to multiply terms from junctions which contain at least some samples going right and left. Hence, we only need specify $\Pi$ to the quantile level where partitions contain more than one observation.

Our algorithm is as follows: 

Algorithm 1 Bayesian nonparametric test 

1. Fix the binary tree on the quantiles of some centering distribution $G$.

2. Set $\alpha_j = m^2$ where $m$ denotes the level in the tree of the corresponding junction.

3. Add the log of the contributions of terms in (10) for each junction in the tree that has non-zero numbers of observations in $y^{(1,2)}$ going both right and left.

4. Report $\Pr(H_0 \mid y^{(1,2)})$ as

$$\Pr(H_0 \mid y^{(1,2)}) = \frac{1}{1 + \exp(-LOR)}$$

where LOR denotes the log odds ratio calculated at step 3.
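The steps of Algorithm 1 can be sketched in a few lines of Python. This is an illustrative reading, not the authors' code: it assumes the default prior of Section 3.1 (joint data standardised, partition on the quantiles of $N(0, 1)$), counts the top junction as level $m = 1$ with $\alpha_j = c m^2$, and uses hypothetical function names. It recurses only into junctions that hold observations from both samples, because a junction holding a single sample contributes a factor of exactly one to (10), as do all of its descendants. The `max_depth` argument is a guard added here against exact ties between the samples.

```python
from math import exp, lgamma, log
from statistics import NormalDist, mean, pstdev


def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def junction_log_bf(a, n1, n2):
    """Log of one factor of (10); n1 and n2 are (left, right) counts."""
    pooled = log_beta(a + n1[0] + n2[0], a + n1[1] + n2[1])
    separate = log_beta(a + n1[0], a + n1[1]) + log_beta(a + n2[0], a + n2[1])
    return log_beta(a, a) + pooled - separate


def polya_tree_test(y1, y2, c=1.0, prior_h0=0.5, max_depth=40):
    """Return Pr(H0 | y1, y2); assumes the joint data are not all identical."""
    joint = list(y1) + list(y2)
    mu, sd = mean(joint), pstdev(joint)
    cdf = NormalDist().cdf
    u1 = [cdf((y - mu) / sd) for y in y1]   # probability scale: quantile
    u2 = [cdf((y - mu) / sd) for y in y2]   # splits become midpoint splits

    log_bf = 0.0
    stack = [(u1, u2, 0.0, 1.0, 1)]         # (sample 1, sample 2, lo, hi, level)
    while stack:
        s1, s2, lo, hi, m = stack.pop()
        if not s1 or not s2 or m > max_depth:
            continue                         # factor is 1 from here downwards
        mid = 0.5 * (lo + hi)
        l1, r1 = [u for u in s1 if u < mid], [u for u in s1 if u >= mid]
        l2, r2 = [u for u in s2 if u < mid], [u for u in s2 if u >= mid]
        log_bf += junction_log_bf(c * m ** 2, (len(l1), len(r1)), (len(l2), len(r2)))
        stack.append((l1, l2, lo, mid, m + 1))
        stack.append((r1, r2, mid, hi, m + 1))

    lor = log_bf + log(prior_h0) - log(1.0 - prior_h0)
    if lor >= 0:
        return 1.0 / (1.0 + exp(-lor))
    return exp(lor) / (1.0 + exp(lor))       # avoids overflow for large -lor


far_apart = polya_tree_test([-2 - 0.01 * i for i in range(30)],
                            [2 + 0.01 * i for i in range(30)])
print(far_apart)  # essentially zero: the samples never share a half of the tree
```

With equal prior odds the log odds ratio is just the log Bayes factor, so the final lines implement step 4 of the algorithm.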



3.1 Prior specification 

The Bayesian procedure requires the specification of $\{\Pi, \mathcal{A}\}$ in the Polya tree. While there are good guidelines for setting $\mathcal{A}$, the setting of $\Pi$ is more problem specific. Our current, default, guideline is to first standardise the joint data $y^{(1,2)}$ to have mean 0 and standard deviation 1 and then set the partition on the quantiles of a standard normal density, $\Pi = \Phi(\cdot)^{-1}$. We have found this to work well as a default in most situations, though of course the reader is encouraged to set $\Pi$ according to their subjective beliefs.

3.2 Characteristics 

We can explore the contribution of each term in (10) as a function of $n_{jk}^{(l)}$, $k \in \{0, 1\}$, $l \in \{1, 2\}$. We can see from (10) that the overall Bayes Factor has the form of a product of Beta-Binomial tests at each junction in the tree, to be interpreted as "do you need one $\theta_j$ or two, $\{\theta_j^{(1)}, \theta_j^{(2)}\}$, in order to model the distribution of the data going left and right at each junction". Figure 2 shows the contribution from terms where $n_{j0}^{(1)} = n_{j1}^{(2)} = 0, 1, \ldots, n$ for $n = 10, 100, 1000$ in the first, second and third columns respectively and for $\alpha = 0.1, 10$ in the first and second rows respectively. We can see that as the proportion of data going left moves away from 50%, each term starts to provide increasing evidence against the null.

In Figure 2 the curvature of the log marginal likelihood ratio changes with $n$; note the changes in the vertical scale as $n$ changes.
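The behaviour of a single junction can be reproduced numerically. The sketch below takes one reading of the symmetric case, which is an assumption of this example: each sample contributes $n$ observations to the junction, sample 1 sends $k$ of them left and sample 2 sends $k$ of them right. The function name is hypothetical.

```python
from math import lgamma


def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def symmetric_log_bf(k, n, a):
    """Log factor of (10) when sample 1 splits (k, n - k) and sample 2 (n - k, k)."""
    pooled = log_beta(a + n, a + n)              # pooled counts are (n, n)
    separate = 2.0 * log_beta(a + k, a + n - k)  # both samples give the same value
    return log_beta(a, a) + pooled - separate


for k in (0, 2, 5, 8, 10):
    print(k, round(symmetric_log_bf(k, 10, 0.1), 3))
```

For fixed $n$ and $\alpha$ the term is largest at $k = n/2$, where both samples split evenly, and decreases symmetrically as $k$ moves towards $0$ or $n$, in line with the description above.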

In Figure 3 we look at the frequentist distribution of $\Pr(H_1 \mid y)$ when $y$ is generated under the null $y^{(1)}, y^{(2)} \sim N(0, 1)$, assuming a priori $\Pr(H_1) = 0.5$. For a given sample size $n$, on the x-axis, we repeatedly drew 1000 data sets under the null, calculating the probability assigned to the alternative hypothesis for each set. The expected value as a function of sample size along with 90% confidence intervals is shown in Figure 3. We can see that the estimator appears to be consistent in converging to 0 as $n \to \infty$, though we have been unable to prove this result holds for any $H_0 : F^{(1,2)}$.

Figure 2: Shows the log marginal likelihood ratio as a function of $n$ and $\alpha$ for the symmetric case.

Figure 3: Shows the frequentist distribution, mean and 90% confidence intervals, for $\Pr(H_1 \mid y)$ as a function of sample size, $N$, when samples are drawn under the null $y^{(1)}, y^{(2)} \sim N(0, 1)$ for prior $\Pr(H_1) = 0.5$.

3.3 Simulations



To examine the operating performance of the method we consider the following experiments designed 
to explore various canonical departures from the null. 

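a) Mean shift: $Y^{(1)} \sim N(0, 1)$, $Y^{(2)} \sim N(\theta, 1)$, $\theta = 0, \ldots, 3$

b) Variance shift: $Y^{(1)} \sim N(0, 1)$, $Y^{(2)} \sim N(0, \theta^2)$, $\theta = 1, \ldots, 3$

c) Mixture: $Y^{(1)} \sim N(0, 1)$, $Y^{(2)} \sim \frac{1}{2} N(\theta, 1) + \frac{1}{2} N(-\theta, 1)$, $\theta = 0, \ldots, 3$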

d) Tails: F^i' ~ A/'(0, 1), F^^) _ t(e»-i), 6* = IQ-^, . . . , 10 

e) Skew: $Y^{(1)} \sim N(0, 1)$, $Y^{(2)} \sim SN(0, 1, \theta)$, $\theta = 1, \ldots, 10$

f) Outlier: $Y^{(1)} \sim N(0, 1)$, $Y^{(2)} \sim (1 - \theta) N(0, 1) + \theta N(0, 20)$, $\theta = 0, \ldots, 1$

g) Lognormal mean shift: $\log Y^{(1)} \sim N(0, 1)$, $\log Y^{(2)} \sim N(\theta, 1)$, $\theta = 0, \ldots, 3$

h) Lognormal variance shift: $\log Y^{(1)} \sim N(0, 1)$, $\log Y^{(2)} \sim N(0, \theta^2)$, $\theta = 1, \ldots, 3$

where $SN(0, 1, \lambda)$ is the skew normal distribution with skewness parameter $\lambda$. The default mean distribution $F_0^{(1,2)} = N(0, 1)$ was used in the Polya tree to construct the partition $\Pi$, and $\alpha = m^2$. Data are standardized. Comparisons are performed with $n_0 = n_1 = 50$ against the two-sample Kolmogorov-Smirnov and Wilcoxon rank tests. To compare the models we explore the "power to detect the alternative". As a test statistic for the Bayesian model we simulate data under the null and then take the empirical 0.95 quantile of the distribution of Bayes Factors as a threshold to declare $H_1$. This is known as "the Bayes, non-Bayes compromise" by Good [1992]. Results are reported in Figure 4. As a general rule we can see that the KS test is more sensitive to changes in central location while the Bayes test is more sensitive to changes to tails or higher moments. We also see that the subjective Bayes test appears to be inconsistent in some circumstances and fails to detect changes from a normal to a $t$-density with varying degrees of freedom.
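The thresholding step can be sketched generically: simulate many data sets under the null, evaluate a statistic on each, and take the empirical 0.95 quantile as the cut-off for declaring $H_1$. In the sketch below the statistic is a stand-in (absolute difference in sample means) chosen only to keep the example self-contained; in the setting above it would be a measure of evidence against $H_0$ from the Bayesian test. The function names, the seed and the $N(0, 1)$ null are assumptions of this example.

```python
import random


def null_threshold(statistic, n1, n2, reps=1000, level=0.95, seed=0):
    """Empirical `level` quantile of `statistic` over data simulated under H0.

    Larger values of `statistic` are taken to mean more evidence against H0.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        y1 = [rng.gauss(0.0, 1.0) for _ in range(n1)]
        y2 = [rng.gauss(0.0, 1.0) for _ in range(n2)]
        values.append(statistic(y1, y2))
    values.sort()
    return values[min(reps - 1, int(level * reps))]


def abs_mean_diff(y1, y2):
    """Stand-in statistic for illustration only."""
    return abs(sum(y1) / len(y1) - sum(y2) / len(y2))


threshold = null_threshold(abs_mean_diff, 50, 50)
print(threshold)
```

Power at a given alternative is then estimated as the fraction of data sets simulated under that alternative whose statistic exceeds the threshold.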

One clear advantage of the Bayesian model is that it provides an explicit measure of $\Pr(H_1 \mid y)$. The expectation of $\Pr(H_1 \mid y)$ arising from the simulations in Figure 4, along with 90% frequentist confidence intervals derived from 100 repeated simulations, is shown in Figure 5. Again, we observe the inability of the subjective Bayes test to detect tail changes between a normal and a $t$-distribution, with all other tests showing intuitive performance. We investigated a more diffuse centering distribution for the Polya Tree, $\Pi = [G^{(1,2)}]^{-1}$, such as $G^{(1,2)} = N(0, 2)$ and $G^{(1,2)} = t(2)$, but the behavior still persists.

The dyadic partition structure of the Polya Tree allows us to break down the contribution to the Bayes Factor by levels. That is, we can explore the contribution, in the log of equation (10), by level. This is shown in Figure 6 as boxplots of the distribution of log BF statistics across the levels for the simulations generated for Figure 4. This is a strength of the Polya tree test in that it provides a qualitative and quantitative decomposition of the contribution to the evidence against the null from differing levels of the tree. For example, we observe that shifts in central location are, unsurprisingly, detected at the topmost level of the tree, while changes to the tails or variances are detected further down in the quantiles or below. This provides the statistician with a useful gauge on where signal against the null is coming from.

We next explore sensitivity to the prior parameters $\alpha = c m^2$ by changing the constant such that $\alpha = 10 m^2$. Figures 7 and 8 show the corresponding results analogous to Figure 4 and Figure 5 with $\alpha = m^2$. Increasing the constant places greater weight on the higher levels of the tree and hence we find that setting $c = 10$ improves the precision to detect a shift in central location at the expense of sensitivity to lower quantiles, such as tail and higher moment detection.



4 An empirical Bayes procedure 

The Bayesian procedure above requires the subjective specification of the partition structure $\Pi$. This subjective setting may make some users uneasy regarding the sensitivity to specification. Moreover, we have seen that for certain tests under $H_1$ the subjective test performs poorly. In this section we explore an empirical procedure whereby the partition $\Pi$ is centered on the data via the empirical cdf of the joint data, $\Pi = [\hat{F}^{(1,2)}]^{-1}$. This is akin to an empirical Bayes procedure with a flat prior $\pi(\Pi)$ over partitions and the setting of $\Pi$ to the marginal MAP estimate. The empirical cdf maximises the marginal likelihood when using symmetric $Be(\alpha_j, \alpha_j)$ priors, as a priori we expect to see equal numbers of observations going left or right at any junction in the tree. This provides a default setting for the partition.

Let $\Pi$ be the partition constructed with the quantiles of the empirical distribution $\hat{F}^{(1,2)}$ of $y^{(1,2)}$. Under $H_0$, there are now no free parameters and only one degree of freedom in the random variables $\{n_{j0}^{(1)}, n_{j1}^{(1)}, n_{j0}^{(2)}, n_{j1}^{(2)}\}$ since, conditional on the partition centered on the empirical cdf of the joint, once one of the variables has been specified the others are then known. We consider, arbitrarily, the marginal distribution of $\{n_{j0}^{(1)}\}$, which is now a product of hypergeometric distributions (we only consider levels where $n_j^{(12)} > 1$)

$$\Pr(\{n_{j0}^{(1)}\} \mid H_0, \Pi, \mathcal{A}) \propto \prod_j \binom{n_j^{(1)}}{n_{j0}^{(1)}} \binom{n_j^{(12)} - n_j^{(1)}}{n_{j0}^{(12)} - n_{j0}^{(1)}} \bigg/ \binom{n_j^{(12)}}{n_{j0}^{(12)}} \qquad (12)$$

$$= \prod_j HypGeo(n_{j0}^{(1)}; n_j^{(12)}, n_j^{(1)}, n_{j0}^{(12)}) \qquad (13)$$

if $\max(0, n_{j0}^{(12)} + n_j^{(1)} - n_j^{(12)}) \le n_{j0}^{(1)} \le \min(n_j^{(1)}, n_{j0}^{(12)})$, $0$ otherwise.

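Under $H_1$, the marginal distribution of $\{n_{j0}^{(1)}\}$ is a product of the conditional distribution of independent binomial variates, conditional on their sum,

$$\Pr(\{n_{j0}^{(1)}\} \mid H_1, \Pi, \mathcal{A}) \propto \prod_j \frac{g(n_{j0}^{(1)}; n_j^{(12)}, n_j^{(1)}, n_{j0}^{(12)}, \theta_j^{(1)}, \theta_j^{(2)})}{\sum_x g(x; n_j^{(12)}, n_j^{(1)}, n_{j0}^{(12)}, \theta_j^{(1)}, \theta_j^{(2)})} \qquad (14)$$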

if $\max(0, n_{j0}^{(12)} + n_j^{(1)} - n_j^{(12)}) \le n_{j0}^{(1)} \le \min(n_j^{(1)}, n_{j0}^{(12)})$, $0$ otherwise, and where



and 



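$$\theta_j^{(1)} \mid \mathcal{A} \sim Be(\alpha_{j0}, \alpha_{j1}), \qquad \theta_j^{(2)} \mid \mathcal{A} \sim Be(\alpha_{j0}, \alpha_{j1}).$$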



Now, consider the odds tOj ~ and let 



E.5(x;nf\nW,n(f,^«,^f) 



Then it can be seen that $W(x; N, m, n, \omega)$ is the Wallenius noncentral hypergeometric distribution [Wallenius, 1963; Johnson et al., 2005] whose pdf is

$$W(x; N, m, n, \omega) = \binom{m}{x} \binom{N - m}{n - x} \int_0^1 (1 - t^{\omega/D})^{x} (1 - t^{1/D})^{n - x} \, dt$$

where $D = \omega(m - x) + (N - m - n + x)$. Note there are C++ and R routines to evaluate the pdf¹. The Wallenius noncentral hypergeometric distribution models a biased urn sampling scheme whereby there is a different likelihood of drawing one type of ball over another at each draw. The Bayes factor is now given by

¹See references there: http://en.wikipedia.org/wiki/Wallenius'_noncentral_hypergeometric_distribution

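$$BF = \frac{\prod_j HypGeo(n_{j0}^{(1)}; n_j^{(12)}, n_j^{(1)}, n_{j0}^{(12)})}{\Pr(\{n_{j0}^{(1)}\} \mid H_1, \Pi, \mathcal{A})} \qquad (15)$$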

where the marginal likelihood in the denominator can be evaluated using importance sampling or one-dimensional quadrature.
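As a rough stand-in for the C++ and R routines mentioned above, the Wallenius pmf can be evaluated by one-dimensional quadrature in plain Python. The sketch below uses the standard integral form of the pmf with $D = \omega(m - x) + (N - m - n + x)$, a substitution $t = s^r$ with $r = \max(1, D)$ to smooth the integrand near $t = 0$, and composite Simpson's rule. The function names, the substitution and the number of quadrature steps are choices made for this example. With $\omega = 1$ the result reduces to the ordinary hypergeometric pmf, which gives a convenient check.

```python
from math import comb


def in_support(x, N, m, n):
    return max(0, n + m - N) <= x <= min(m, n)


def hypergeom_pmf(x, N, m, n):
    """P(x marked items) when n items are drawn from N, of which m are marked."""
    if not in_support(x, N, m, n):
        return 0.0
    return comb(m, x) * comb(N - m, n - x) / comb(N, n)


def wallenius_pmf(x, N, m, n, omega, steps=2000):
    """Wallenius pmf W(x; N, m, n, omega) by Simpson's rule (steps must be even)."""
    if not in_support(x, N, m, n):
        return 0.0
    D = omega * (m - x) + (N - m - n + x)
    if D == 0:                      # every item is drawn, so x = m with certainty
        return 1.0
    r = max(1.0, D)                 # substitute t = s**r, dt = r * s**(r - 1) ds

    def f(s):
        left = (1.0 - s ** (r * omega / D)) ** x
        right = (1.0 - s ** (r / D)) ** (n - x)
        return left * right * r * s ** (r - 1.0)

    h = 1.0 / steps
    total = f(0.0) + f(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return comb(m, x) * comb(N - m, n - x) * total * h / 3.0


N, m, n = 12, 5, 4
print([round(wallenius_pmf(x, N, m, n, 2.5), 4) for x in range(n + 1)])
```

Odds $\omega > 1$ shift mass towards larger $x$ relative to the hypergeometric case, which is the biased-urn behaviour described above.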

The empirical Bayes two-sample test can then be given as: 



Algorithm 2 Empirical Bayes nonparametric test 

1. Fix the binary tree on the quantiles of the empirical distribution $\hat{F}^{(1,2)}$.

2. Set $\alpha_j = m^2$ where $m$ denotes the level in the tree of the corresponding junction.

3. Add the log of the contributions of terms in (15), evaluated using importance sampling or quadrature, for each junction in the tree that has non-zero numbers of observations in $y^{(1,2)}$ going both right and left.

4. Report $\Pr(H_0 \mid y^{(1,2)})$ as

$$\Pr(H_0 \mid y^{(1,2)}) = \frac{1}{1 + \exp(-LOR)}$$

where LOR denotes the log odds ratio calculated at step 3.



We repeated the simulations from Section 3.3 with $\alpha = m^2$. The corresponding results are shown in Figures 9, 10 and 11. We observe similar behaviour to the subjective test but, importantly, we see that the problem in detecting the difference between the normal and the $t$-distribution is corrected. Moreover, no standardisation of the data is required for this test.



5 Conclusions 



We have described a Bayesian nonparametric hypothesis test for real-valued data which provides an explicit measure of $\Pr(H_0 \mid y^{(1,2)})$. The test is based on a fully specified Polya Tree prior for which we are able to derive an explicit form for the Bayes Factor. The choice of the partition is quite crucial for the subjective Bayes test. This is a well known phenomenon of Polya Tree priors and some interesting directions to mitigate its effects can be found in Hanson and Johnson [2002], Paddock et al. [2003] and Hanson [2006]. To this aim we also provided an automated empirical Bayes procedure which centres the partition on the empirical cdf of the joint data, which was seen to rectify problems in the subjective Bayes test.

Figure 4: Power of the Bayes Polya Tree prior test with $\alpha_j = m^2$ tested on the simulations from Section 3.3. Gaussian distribution with varying (a) mean (b) variance (c) mixture (d) tails (e) skewness (f) outlier. Log-normal distribution with varying (g) mean (h) variance. The x-axis measures the value of $\theta$, the parameter in the alternative given in Section 3.3. Legend: Kolmogorov-Smirnov test (red dashed), Wilcoxon (red dot-dashed), Bayesian nonparametric test (solid blue).

Figure 5: Expected probability of $H_1$, with 90% frequentist confidence intervals, obtained by repeatedly applying the Bayes Polya Tree prior test with $\alpha_j = m^2$ for the tests shown in Figure 4. Panels: (a) Gaussian: mean (b) Gaussian: variance (c) Gaussian: mixture (g) lognormal: mean (h) lognormal: variance. The x-axis records the value of $\theta$, the parameter of the alternative, on the same scale as Figure 4.

Figure 6: Contribution to Bayes Factors from different levels of the Polya Tree under the alternative. Gaussian distribution with varying (a) mean (b) variance (c) mixture (d) tails (e) skewness (f) outlier; Log-normal distribution with varying (g) mean (h) variance, from Section 3.3. Parameters of $H_1$ were set to the mid-points of the x-axis in Figure 4.

Figure 7: Same as Figure 4 but now with $\alpha_j = 10 m^2$. Gaussian distribution with varying (a) mean (b) variance (c) mixture (d) tails (e) skewness (f) outlier. Log-normal distribution with varying (g) mean (h) variance. Legend: Kolmogorov-Smirnov test (red dashed), Wilcoxon (red dot-dashed), Bayesian nonparametric test (solid blue).

Figure 8: Contributions to BFs using $\alpha_j = 10 m^2$. Panels: (a) Gaussian: mean (b) Gaussian: variance (c) Gaussian: mixture (d) Gaussian: tail (e) Gaussian: skewness (f) Gaussian: outlier (g) lognormal: mean (h) lognormal: variance. Legend: Kolmogorov-Smirnov test versus Bayesian nonparametric test. Parameters of $H_1$ were set to the mid-points of the x-axis in Figure 4.

Figure 9: Empirical Bayes test using $\alpha_j = m^2$. Power to detect Gaussian distribution with varying (a) mean (b) variance (c) mixture (d) tails (e) skewness (f) outlier and Log-normal distribution with varying (g) mean (h) variance, as in Section 3.3. Legend: Kolmogorov-Smirnov test (red dashed), Wilcoxon (red dot-dashed), Bayesian nonparametric test (solid blue).






Figure 10: Expected probability of $H_1$ for the empirical Bayes test, with 90% confidence intervals, obtained by repeatedly applying the Bayes Polya Tree prior test with $\alpha_j = m^2$ on Gaussian distribution with varying (a) mean (b) variance (c) mixture (d) tails (e) skewness (f) outlier and log-normal distribution with varying (g) mean (h) variance, as in Section 3.3.

Figure 11: Contribution to the Bayes Factor for differing levels of the empirical Polya Tree prior for Gaussian distribution with varying (a) mean (b) variance (c) mixture (d) tails (e) skewness (f) outlier. Log-normal distribution with varying (g) mean (h) variance.